Maximizing Crosstalk-Induced Slowdown During Path Delay Test
Capacitive crosstalk between adjacent signal wires in integrated circuits may lead to noise or a speedup or slowdown in signal transitions. These in turn may lead to circuit failure or reduced operating speed. This thesis focuses on generating test patterns to induce crosstalk-induced signal delays, in order to determine whether the circuit can still meet its timing specification. A timing-driven test generator is developed to sensitize multiple aligned aggressors coupled to a delay-sensitive victim path to detect the combination of a delay spot defect and crosstalk-induced slowdown. The framework uses parasitic capacitance information, timing windows and crosstalk-induced delay estimates to screen out unaligned or ineffective aggressors coupled to a victim path, speeding up crosstalk pattern generation. In order to induce maximum crosstalk slowdown along a path, aggressors are prioritized based on their potential delay increase and timing alignment. The test generation engine introduces the concept of alignment-driven path sensitization to generate paths from inputs to coupled aggressor nets that meet timing alignment and direction requirements. By using path delay information obtained from circuit preprocessing, preferred paths can be chosen during aggressor path propagation processes. As the test generator sensitizes aggressors in the presence of victim path necessary assignments, the search space is effectively reduced for aggressor path generation. This helps in reducing the test generation time for aligned aggressors. In addition, two new crosstalk-driven dynamic test compaction algorithms are developed to control the increase in test pattern count. The proposed test generation algorithm is applied to ISCAS85 and ISCAS89 benchmark circuits. SPICE simulation results demonstrate the ability of the alignment-driven test generator to increase crosstalk-induced delays along victim paths.
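The screening and prioritization step described in this abstract can be pictured with a small sketch. This is not the thesis's implementation: the `Aggressor` record, the `overlap` and `screen_and_rank` helpers, the `min_delay` threshold, and the tie-breaking rule are all hypothetical, chosen only to illustrate dropping aggressors whose timing window misses the victim's window and ranking the rest by estimated delay increase.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Aggressor:
    # Hypothetical record; field names are illustrative assumptions.
    name: str
    window: Tuple[float, float]  # (earliest, latest) switching time
    est_delay: float             # estimated delay increase on the victim


def overlap(a: Tuple[float, float], b: Tuple[float, float]) -> float:
    """Length of the intersection of two timing windows (0.0 if disjoint)."""
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))


def screen_and_rank(aggressors: List[Aggressor],
                    victim_window: Tuple[float, float],
                    min_delay: float) -> List[Aggressor]:
    """Drop unaligned or ineffective aggressors, then rank the remainder."""
    kept = [
        a for a in aggressors
        if overlap(a.window, victim_window) > 0.0 and a.est_delay >= min_delay
    ]
    # Largest estimated delay first; window overlap breaks ties.
    return sorted(
        kept,
        key=lambda a: (a.est_delay, overlap(a.window, victim_window)),
        reverse=True,
    )


if __name__ == "__main__":
    candidates = [
        Aggressor("a1", (0.0, 2.0), 5.0),
        Aggressor("a2", (5.0, 6.0), 9.0),   # never overlaps the victim window
        Aggressor("a3", (1.0, 3.0), 7.0),
        Aggressor("a4", (2.0, 4.0), 0.1),   # delay estimate below threshold
    ]
    ranked = screen_and_rank(candidates, victim_window=(1.0, 3.0), min_delay=1.0)
    print([a.name for a in ranked])
```

Screening before ranking keeps the expensive path sensitization step focused on aggressors that can actually contribute slowdown, which is the speedup the abstract attributes to the framework.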
Jumping through Local Minima: Quantization in the Loss Landscape of Vision Transformers
Quantization scale and bit-width are the most important parameters when
considering how to quantize a neural network. Prior work focuses on optimizing
quantization scales in a global manner through gradient methods (gradient
descent and Hessian analysis). Yet, when applying perturbations to quantization
scales, we observe a very jagged, highly non-smooth test loss landscape. In
fact, small perturbations in quantization scale can greatly affect accuracy,
yielding an accuracy boost in 4-bit quantized vision transformers
(ViTs). In this regime, gradient methods break down, since they cannot reliably
reach local minima. In our work, dubbed Evol-Q, we use evolutionary search to
effectively traverse the non-smooth landscape. Additionally, we propose using
an infoNCE loss, which not only helps combat overfitting on the small
calibration dataset but also makes traversing such a highly
non-smooth surface easier. Evol-Q improves the top-1 accuracy of a fully
quantized ViT-Base at three weight quantization bit-widths.
Extensive experiments on a variety of
CNN and ViT architectures further demonstrate its robustness in extreme
quantization scenarios. Our code is available at
https://github.com/enyac-group/evol-q
Comment: arXiv admin note: text overlap with arXiv:2211.0964
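To make the idea of searching over quantization scales without gradients concrete, here is a minimal sketch. It is not the Evol-Q code (see the repository above for that): `proxy_loss` is a stand-in mean-squared quantization error rather than the infoNCE loss from the abstract, and `evolve_scale`, its parameters, and the perturbation range `sigma` are assumptions made for illustration.

```python
import random


def quantize(weights, scale, bits=4):
    """Uniform symmetric quantization: snap to an integer grid, clip, rescale."""
    qmax = 2 ** (bits - 1) - 1
    qmin = -(2 ** (bits - 1))
    return [max(qmin, min(qmax, round(w / scale))) * scale for w in weights]


def proxy_loss(weights, scale, bits=4):
    """Stand-in objective (assumption): mean squared quantization error."""
    quantized = quantize(weights, scale, bits)
    return sum((w - q) ** 2 for w, q in zip(weights, quantized)) / len(weights)


def evolve_scale(weights, init_scale, bits=4, generations=40,
                 children=8, sigma=0.05, seed=0):
    """Perturb the current best scale and keep any child that lowers the loss."""
    rng = random.Random(seed)
    best_scale = init_scale
    best_loss = proxy_loss(weights, init_scale, bits)
    for _ in range(generations):
        parent = best_scale
        for _ in range(children):
            # Multiplicative perturbation keeps the scale positive for sigma < 1.
            candidate = parent * (1.0 + rng.uniform(-sigma, sigma))
            loss = proxy_loss(weights, candidate, bits)
            if loss < best_loss:
                best_scale, best_loss = candidate, loss
    return best_scale, best_loss


if __name__ == "__main__":
    w = [0.9, -0.7, 0.31, 0.05, -0.42, 0.66, -0.13, 0.27]
    print(evolve_scale(w, init_scale=0.5))
```

Because the rounding step makes the loss piecewise constant in places and discontinuous in others, a search that only compares loss values can make progress where a gradient step would not.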
Run-Time Efficient RNN Compression for Inference on Edge Devices
Recurrent neural networks can be large and compute-intensive, yet many
applications that benefit from RNNs run on small devices with very limited
compute and storage capabilities while still having run-time constraints. As a
result, there is a need for compression techniques that can achieve significant
compression without negatively impacting inference run-time and task accuracy.
This paper explores a new compressed RNN cell implementation called Hybrid
Matrix Decomposition (HMD) that achieves this dual objective. This scheme
divides the weight matrix into two parts - an unconstrained upper half and a
lower half composed of rank-1 blocks. This results in output features where the
upper sub-vector has "richer" features while the lower sub-vector has
"constrained features". HMD can compress RNNs by a factor of 2-4x while having
a faster run-time than pruning (Zhu & Gupta, 2017) and retaining more model
accuracy than matrix factorization (Grachev et al., 2017). We evaluate this
technique on 5 benchmarks spanning 3 different applications, illustrating its
generality in the domain of edge computing.
Comment: Published at 4th edition of Workshop on Energy Efficient Machine
Learning and Cognitive Computing for Embedded Applications at International
Symposium of Computer Architecture 2019, Phoenix, Arizona
(https://www.emc2-workshop.com/isca-19) colocated with ISCA 201
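The weight-matrix split described in the HMD abstract can be sketched as follows. The abstract does not say how the rank-1 blocks tile the lower half, so this sketch assumes each block spans all columns and a slice of the lower rows, stored as a pair `(u, v)` standing for the outer product u v^T. The names `hmd_matvec` and `hmd_param_count` are hypothetical.

```python
def hmd_matvec(upper, blocks, x):
    """Multiply a hybrid matrix by x.

    upper  : list of dense rows (the unconstrained upper half)
    blocks : list of (u, v) pairs; each pair is a rank-1 block u v^T
             (assumed layout: every block spans all columns)
    """
    # Dense part: one dot product per row.
    y = [sum(w * xi for w, xi in zip(row, x)) for row in upper]
    # Rank-1 part: one dot product per block, then a scaling of u.
    for u, v in blocks:
        s = sum(vi * xi for vi, xi in zip(v, x))
        y.extend(ui * s for ui in u)
    return y


def hmd_param_count(upper, blocks):
    """Number of stored parameters in the hybrid representation."""
    dense = sum(len(row) for row in upper)
    low_rank = sum(len(u) + len(v) for u, v in blocks)
    return dense + low_rank


if __name__ == "__main__":
    upper = [[1, 0, 0, 0], [0, 1, 0, 0]]
    blocks = [([1, 2], [1, 1, 1, 1])]
    x = [1, 2, 3, 4]
    print(hmd_matvec(upper, blocks, x))    # prints [1, 2, 10, 20]
    print(hmd_param_count(upper, blocks))  # prints 14 (a dense 4x4 stores 16)
```

In this layout the lower half costs one dot product per block instead of one per row, which is consistent with the abstract's claim that compression need not slow down inference.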
PerfSAGE: Generalized Inference Performance Predictor for Arbitrary Deep Learning Models on Edge Devices
The ability to accurately predict deep neural network (DNN) inference
performance metrics, such as latency, power, and memory footprint, for an
arbitrary DNN on a target hardware platform is essential to the design of
DNN-based models. This ability is critical for the (manual or automatic) design,
optimization, and deployment of practical DNNs for a specific hardware
deployment platform. Unfortunately, these metrics are slow to evaluate using
simulators (where available) and typically require measurement on the target
hardware. This work describes PerfSAGE, a novel graph neural network (GNN) that
predicts inference latency, energy, and memory footprint on an arbitrary DNN
TFlite graph (TFL, 2017). In contrast, previously published performance
predictors can only predict latency and are restricted to pre-defined
construction rules or search spaces. This paper also describes the EdgeDLPerf
dataset of 134,912 DNNs randomly sampled from four task search spaces and
annotated with inference performance metrics from three edge hardware
platforms. Using this dataset, we train PerfSAGE and provide experimental
results that demonstrate state-of-the-art prediction accuracy with a Mean
Absolute Percentage Error of <5% across all targets and model search spaces.
These results: (1) Outperform previous state-of-the-art GNN-based predictors
(Dudziak et al., 2020), (2) Accurately predict performance on accelerators (a
shortfall of non-GNN-based predictors (Zhang et al., 2021)), and (3)
Demonstrate predictions on arbitrary input graphs without modifications to the
feature extractor.
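For reference, the Mean Absolute Percentage Error quoted above is the standard metric sketched below. The function name `mape` and the sample latencies are illustrative and are not taken from the paper or the EdgeDLPerf dataset.

```python
def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent.

    Raises ValueError on empty or mismatched inputs; measured values
    must be non-zero because each error is divided by the measurement.
    """
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


if __name__ == "__main__":
    measured_ms = [100.0, 200.0]    # made-up measured latencies
    predicted_ms = [110.0, 190.0]   # made-up predictions
    print(mape(measured_ms, predicted_ms))  # about 7.5
```

Since each error is normalized by the measured value, the metric is comparable across latency, energy, and memory targets that differ widely in magnitude.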
Effect of participatory women’s groups facilitated by Accredited Social Health Activists on birth outcomes in rural eastern India: a cluster-randomised controlled trial
Background A quarter of the world’s neonatal deaths and 15% of maternal deaths happen in India. Few
community-based strategies to improve maternal and newborn health have been tested through the country’s
government-approved Accredited Social Health Activists (ASHAs). We aimed to test the effect of participatory
women’s groups facilitated by ASHAs on birth outcomes, including neonatal mortality.
Methods In this cluster-randomised controlled trial of a community intervention to improve maternal and newborn
health, we randomly assigned (1:1) geographical clusters in rural Jharkhand and Odisha, eastern India to intervention
(participatory women’s groups) or control (no women’s groups). Study participants were women of reproductive age
(15–49 years) who gave birth between Sept 1, 2009, and Dec 31, 2012. In the intervention group, ASHAs supported
women’s groups through a participatory learning and action meeting cycle. Groups discussed and prioritised maternal
and newborn health problems, identified strategies to address them, implemented the strategies, and assessed their
progress. We identified births, stillbirths, and neonatal deaths, and interviewed mothers 6 weeks after delivery. The
primary outcome was neonatal mortality over a 2-year follow-up. Analyses were by intention to treat. This trial is
registered with ISRCTN, number ISRCTN31567106.
Findings Between September, 2009, and December, 2012, we randomly assigned 30 clusters (estimated population
156 519) to intervention (15 clusters, estimated population n=82 702) or control (15 clusters, n=73 817). During the
follow-up period (Jan 1, 2011, to Dec 31, 2012), we identified 3700 births in the intervention group and 3519 in the
control group. One intervention cluster was lost to follow-up. The neonatal mortality rate during this period was
30 per 1000 livebirths in the intervention group and 44 per 1000 livebirths in the control group (odds ratio [OR] 0.69,
95% CI 0.53–0.89).
Interpretation ASHAs can successfully reduce neonatal mortality through participatory meetings with women’s groups.
This is a scalable community-based approach to improving neonatal survival in rural, underserved areas of India.